Pronunciation variant-based multi-path HMMs for syllables
نویسندگان
چکیده
Recent research suggests that it is more appropriate to model pronunciation variation with syllable-length acoustic models than with context-dependent phones. Due to the large number of factors contributing to pronunciation variation at the syllable level, the creation of multi-path model topologies appears necessary. In this paper, we propose a novel approach for constructing multi-path models for frequent syllables. The suggested approach uses phonetic knowledge for the initialisation of the parallel paths, and a data-driven solution for their re-estimation. When applied to 94 frequent syllables in a 37-hour corpus of Dutch read speech, it leads to improved recognition performance when compared with a triphone recogniser of similar complexity.
منابع مشابه
Whither Linguistic Interpretation of Acoustic Pronunciation Variation
Recent research suggests that modelling pronunciation variation is more appropriate at the syllable level than at the level of contextdependent phones. Due to the large number of factors affecting syllable pronunciation, the creation of multi-path topologies is nec essary. Previous research on multi-path models in connected digit recognition has proved trajectory clustering to be an attractive...
متن کاملSyllable-based Automatic Arabic Speech Recognition
In this paper, we concentrate on the automatic recognition of Egyptian Arabic speech using syllables. Arabic spoken digits were described by showing their constructing phonemes, triphones, syllables and words. Speaker-independent hidden markov models (HMMs)-based speech recognition system was designed using Hidden markov model toolkit (HTK). The database used for both training and testing consi...
متن کاملMulti-path Syllable Models Based on Phonetic Knowledge
Recent research suggests that syllable-length acoustic models might be more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. In this paper, we compare the recognition performance of two types of recognisers: a conventional recogniser that only uses triphones, and an experimental recogniser that employs a mix ...
متن کاملSyllable-based Automatic Arabic Speech Recognition in Noisy-telephone Channel
The performance of well-trained speech recognizers using high quality full bandwidth speech data is usually degraded when used in real world environments. In particular, telephone speech recognition is extremely difficult due to the limited bandwidth of transmission channels. In this paper, we concentrate on the telephone recognition of Egyptian Arabic speech using syllables. Arabic spoken digi...
متن کاملGame-based Teaching of Stress Placement on Multi-syllabic English Words
Accurate pronunciation is an important component of language ability and the main outward linguistic sign of whether someone is a native speaker of a language or not. An area of particular difficulty for Persian-speaking learners of English, which may cause 'foreign accent' or misunderstanding in speaking, is placement of stress on multi-syllable words. Game-based pronunciation teaching can be ...
متن کامل